Abstract. We give a canonical representation for trim acyclic deterministic
finite automata (ADFA) with n states over an alphabet of k symbols. Using this
normal form, we present a backtracking algorithm for the exact generation of
ADFAs. This algorithm is a non-trivial adaptation of the algorithm for the
exact generation of minimal acyclic deterministic finite automata (MADFA),
presented by Almeida et al.

1 Introduction 

Recently, Liskovets [10] obtained a formula for the enumeration of unlabelled
(non-isomorphic) initially connected acyclic deterministic finite automata with
n states over an alphabet of k symbols. Callan [4] presented a canonical form for
those automata and showed that a certain determinant of Stirling cycle numbers
can also count them. That canonical form is obtained by observing that if
we mark the visited states, starting with the initial state, it is always possible
to find a state whose only incident states are already marked. This induces a
unique labelling of states, but it is not clear how these representations can be
used in automata generation. Almeida et al. [2] obtained a canonical form for
(non-isomorphic) minimal acyclic deterministic finite automata (MADFA) and
an exact generation algorithm. Unfortunately, the canonical form did not directly
provide an enumeration formula for MADFAs. One of the applications of such
an enumeration formula would be in the development of uniform random
generators of automata, useful for the average case analysis of algorithms for that
class of automata. The enumeration of different kinds of finite automata has been
considered by several authors since the late 1950s. For more complete surveys we
refer the reader to Domaratzki et al. [7] and to Domaratzki [6]. Liskovets [9] and
Robinson [14] counted non-isomorphic initially connected deterministic finite
automata (ICDFA). More recently, several authors examined related problems.
Domaratzki et al. [7] studied the (exact and asymptotic) enumeration of distinct
languages accepted by finite automata with n states. Nicaud [12] and Champarnaud
and Paranthoen [5] presented a method for randomly generating complete
ICDFAs. Bassino and Nicaud [3] showed that the number of complete ICDFAs
is $\Theta(n 2^n S(kn, n))$, where $S(kn, n)$ is a Stirling number of the second kind. Based
on a canonical string representation for ICDFAs, Almeida et al. [1] obtained
a new formula for the number of non-isomorphic ICDFAs, and provided exact
and uniform random generators for them.

In this paper, we give a canonical representation for trim (complete) acyclic 
deterministic finite automata (ADFA). By trim we mean that from the initial 
state all other states are reachable (initially connected) and that from all states
(except the dead state) at least one final state is reachable (useful).

This canonical form extends the one for MADFAs by taking into consider- 
ation equivalent states. The backtracking algorithm for the exact generation of 
ADFAs is a non-trivial adaptation of the one for MADFAs, because we must 
properly consider the equivalence classes but still avoid the multiple genera- 
tion of isomorphic automata. It is easy to order equivalent states according to 
the words that reach them (i.e., their left languages) but to obtain a feasible 
generator algorithm we must find an ordering such that: 

— the canonical representation for ADFAs is a natural extension of the canon- 
ical representation for MADFAs (i.e., preserves its characteristics); 

— it allows the detection of an ill-formed automata representation as soon as 
possible (as the algorithm proceeds backwards, towards the initial state); 

— it allows the exact generation algorithm to output the automata canonical 
representations in increasing order. 

ADFAs, as defined here, are a proper subset of the class of acyclic automata 
enumerated by Liskovets and Callan because we only consider automata where 
all the states are useful. Once more, their formulae cannot be used directly, but
in this paper we hope to contribute to a better understanding of the internal 
structure of ADFAs. 

The paper is organized as follows. In the next section some basic concepts 
and notation are introduced. In Section 3 we review some concepts about acyclic 
deterministic finite automata and the canonical form for MADFAs. In Section 4 
we show how to extend that canonical form to ADFAs. In Section 5 we describe 
an algorithm to efficiently generate equivalent states as an extension to the exact 
generator for MADFAs. Some experimental results are also summarized in that 
section. In Section 6 we consider ADFAs enumeration formulas for small values 
of n and k. Finally Section 7 concludes. 

2 Basic concepts 

We review some basic concepts we need in this paper. For more details we refer 
the reader to Hopcroft et al. [8], Yu [15] or Lothaire [11]. 



Let $[n, m]$ denote the set $\{i \in \mathbb{Z} \mid n \le i \le m\}$. In a similar way, we consider
the variants $]n, m]$, $[n, m[$ and $]n, m[$. Whenever we have a finite ordered set
$A$ and a function $f$ on $A$, the expression $(f(a))_{a \in A}$ denotes the values of $f$ for
increasing values of $A$.

Let $\Sigma$ be an alphabet and $\Sigma^*$ be the set of all words over $\Sigma$. The empty word
is denoted by $\epsilon$. The length of a word $x = \sigma_1 \sigma_2 \cdots \sigma_n$, denoted by $|x|$, is $n$. A
language $L$ is a subset of $\Sigma^*$. A language is finite if its cardinality is finite.

The alphabet $\Sigma$ can be equipped with a total order $<$ that allows the
definition of total orders on $\Sigma^*$. A lexicographical order on $\Sigma^*$ is defined as follows.
Let $x = x_1 \ldots x_m$, $y = y_1 \ldots y_n \in \Sigma^*$. Then $x < y$ if:

1. there exists an integer $j \in [1, \min\{m, n\}]$ such that $(\forall i \in [1, j[)\; x_i = y_i$ and
$x_j < y_j$; or

2. $m < n$ and $(\forall i \in [1, m])\; x_i = y_i$.

A deterministic finite automaton (DFA) $A$ is a tuple $(S, \Sigma, \delta, s_0, F)$ where
$S$ is a finite set of states, $\Sigma$ is the alphabet, $\delta : S \times \Sigma \to S$ is the transition
function, $s_0$ the initial state and $F \subseteq S$ the set of final states.

We assume that the transition function is total, so we consider only complete
DFAs. The transition function $\delta$ is inductively extended to $\Sigma^*$, by
$(\forall s \in S)\; \delta(s, \epsilon) = s$ and $\delta(s, x\sigma) = \delta(\delta(s, x), \sigma)$.

A DFA is initially connected (or accessible) (ICDFA) if for each state $s \in S$
there exists a word $x \in \Sigma^*$ such that $\delta(s_0, x) = s$. A DFA is trim if it is an
ICDFA and every state is useful, i.e., $(\forall s \in S)(\exists x \in \Sigma^*)\; \delta(s, x) \in F$.

Two DFAs $(S, \Sigma, \delta, s_0, F)$ and $(S', \Sigma', \delta', s'_0, F')$ are called isomorphic if
$|\Sigma| = |\Sigma'| = k$, there exist bijections $\Pi_1 : \Sigma \to [0, k-1]$, $\Pi_2 : \Sigma' \to [0, k-1]$ and
a bijection $\iota : S \to S'$ such that $\iota(s_0) = s'_0$, $\iota(F) = F'$, and for all $\sigma \in \Sigma$ and
$s \in S$, $\iota(\delta(s, \sigma)) = \delta'(\iota(s), \Pi_2^{-1}(\Pi_1(\sigma)))$.

The language accepted by a DFA $A$ is $\mathcal{L}(A) = \{x \in \Sigma^* \mid \delta(s_0, x) \in F\}$. For
a state $s \in S$ we denote

$\mathcal{L}_L(A, s) = \{x \in \Sigma^* \mid \delta(s_0, x) = s\}$,
$\mathcal{L}_R(A, s) = \{x \in \Sigma^* \mid \delta(s, x) \in F\}$,

the left and the right language of state $s$, respectively. We omit $A$ whenever no
confusion arises. All states of a DFA have distinct left languages.

Two DFAs are equivalent if they accept the same language. We say that two
states $s$ and $s'$ are equivalent, $s \sim s'$, if and only if $\mathcal{L}_R(A, s) = \mathcal{L}_R(A, s')$. A DFA
is minimal if it has no equivalent states and it is initially connected. Minimal
DFAs are unique up to isomorphism.

3 Acyclic finite automata 

An acyclic deterministic finite automaton is a DFA $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$
with $F \subseteq S$ and $s_0 \neq \Omega$ such that $(\forall \sigma \in \Sigma)\; \delta(\Omega, \sigma) = \Omega$ and
$(\forall x \in \Sigma^* \setminus \{\epsilon\})(\forall s \in S)\; \delta(s, x) \neq s$. The state $\Omega$ is called the dead state, and is
the only cyclic state of $A$. The size of $A$ is $|S|$. We are going to consider only trim
complete acyclic deterministic finite automata (ADFA), where all states but $\Omega$ are
useful. It is obvious that the language of an ADFA is finite.

A state $s \in S$ is called pre-dead if $(\forall \sigma \in \Sigma)\; \delta(s, \sigma) = \Omega$. Every ADFA has
at least one pre-dead state and all pre-dead states are final.

Given an ADFA $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$, the rank of a state $s \in S$,
denoted $\mathrm{rk}(s)$, is the length of the longest word $x \in \Sigma^*$ such that $\delta(s, x) \in F$
(i.e., $x \in \mathcal{L}_R(A, s)$). The rank$^1$ of an ADFA $A$, $\mathrm{rk}(A)$, is $\max\{\mathrm{rk}(s) \mid s \in S\}$.
Trivially, we have that $\mathrm{rk}(s_0) = \mathrm{rk}(A)$ and $\mathrm{rk}(s) = 0$ for all pre-dead states $s$.

For every state $s \in S$ with $\mathrm{rk}(s) > 0$ there exists a transition to a state with
rank immediately lower than that of $s$.

Lemma 1. Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be an ADFA, then

$(\forall s \in S)(\mathrm{rk}(s) \neq 0 \Rightarrow (\exists \sigma \in \Sigma)\; \mathrm{rk}(\delta(s, \sigma)) = \mathrm{rk}(s) - 1)$.
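As an illustration, the rank function and the property stated in Lemma 1 can be sketched in Python. The dictionary encoding of the transition function, the marker `DEAD` and the function names are assumptions made for this sketch and do not come from the paper; the example automaton is hypothetical.

```python
DEAD = "Omega"  # hypothetical marker for the dead state


def ranks(delta):
    """Map each non-dead state s to rk(s).

    delta maps every non-dead state to the tuple of its successors (one per
    symbol). In a trim ADFA every maximal path that avoids the dead state ends
    in a pre-dead (hence final) state, so rk(s) is the length of the longest
    such path starting at s.
    """
    memo = {}

    def rk(s):
        if s not in memo:
            memo[s] = max((1 + rk(t) for t in delta[s] if t != DEAD), default=0)
        return memo[s]

    return {s: rk(s) for s in delta}


def satisfies_lemma_1(delta):
    """Every state of non-zero rank has a successor whose rank is one lower."""
    rank = ranks(delta)
    return all(
        rank[s] == 0
        or any(t != DEAD and rank[t] == rank[s] - 1 for t in delta[s])
        for s in delta
    )


# Hypothetical ADFA over {a, b}: s0 is initial, p is pre-dead and final.
delta = {"s0": ("s1", "p"), "s1": ("p", DEAD), "p": (DEAD, DEAD)}
print(ranks(delta))              # {'s0': 2, 's1': 1, 'p': 0}
print(satisfies_lemma_1(delta))  # True
```

The memoised recursion terminates because the only cycle of an ADFA is at the dead state, which is never followed.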

Two states $s$ and $s'$ are mergeable if they are both either final or not final, and
the transition function is identical, i.e.,

$(s \in F \Leftrightarrow s' \in F) \wedge (\forall \sigma \in \Sigma)\; \delta(s, \sigma) = \delta(s', \sigma)$.

For instance, in the ADFA of Figure 1 the states $s_2$ and $s_3$ are mergeable,
and $s_7$ and $s_8$ are mergeable too.




Fig. 1. An ADFA. 



An ADFA can be minimized by merging mergeable states, thus, a minimal 
ADFA (MADFA) can be characterized by: 

1 Also called the diameter of A. 



Lemma 2 ([13, 11]). An ADFA $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ is minimal if and
only if it has no mergeable states.

It is a direct consequence of Lemma 2 that every MADFA has a unique pre-dead
state, $\pi \in S$, and that mergeable states have the same rank. This implies
that to minimize an ADFA it is only necessary to merge states by increasing
rank order (see Revuz [13] or Lothaire [11]).
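The minimization strategy just described can be sketched as follows. This is a minimal illustration in Python, not the procedure of Revuz [13]: the encoding of the automaton, the marker `DEAD` and the name `minimize` are assumptions of this sketch, and the example automaton is hypothetical.

```python
DEAD = "Omega"  # hypothetical marker for the dead state


def minimize(delta, finals):
    """Merge mergeable states of a trim ADFA, visiting states by increasing rank.

    delta maps each non-dead state to the tuple of its successors.
    Returns the transition table and the final states of the merged automaton.
    """
    rank = {}

    def rk(s):
        if s not in rank:
            rank[s] = max((1 + rk(t) for t in delta[s] if t != DEAD), default=0)
        return rank[s]

    rep = {DEAD: DEAD}  # state -> representative kept after merging
    seen = {}           # (finality, redirected transitions) -> representative
    for s in sorted(delta, key=rk):
        # Successors have a lower rank, so their representatives are known.
        signature = (s in finals, tuple(rep[t] for t in delta[s]))
        rep[s] = seen.setdefault(signature, s)

    new_delta = {s: tuple(rep[t] for t in delta[s]) for s in delta if rep[s] == s}
    new_finals = {s for s in finals if rep[s] == s}
    return new_delta, new_finals


# Hypothetical ADFA: p and q are mergeable; after merging them, s and t are too.
delta = {
    "i": ("s", "t"),
    "s": ("p", DEAD),
    "t": ("q", DEAD),
    "p": (DEAD, DEAD),
    "q": (DEAD, DEAD),
}
print(minimize(delta, {"p", "q"}))
# ({'i': ('s', 's'), 's': ('p', 'Omega'), 'p': ('Omega', 'Omega')}, {'p'})
```

Processing the states rank by rank means that a single pass is enough: two states are merged exactly when they agree on finality and on the representatives of their successors.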

3.1 A normal form for MADFAs 

Based upon the above considerations, Almeida et al. [2] presented a canonical 
representation for MADFAs. 

Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be a MADFA with $k = |\Sigma|$ and $n = |S| \ge 2$.
Consider a total order over $\Sigma$ and let $\Pi : \Sigma \to [0, k[$ be the bijection induced
by that order. Let $R_l = \{s \in S \mid \mathrm{rk}(s) = l\}$. It is possible to obtain a canonical
numbering of the states $\varphi : S \cup \{\Omega\} \to [0, n]$ proceeding by increasing rank order
and considering an ordering over the $(k+1)$-tuples that represent the transition
function and the finality of each state. For each state $s \in S$, let its representation
be a $(k+1)$-tuple $\Delta(s) = (\varphi(\delta(s, \Pi^{-1}(0))), \ldots, \varphi(\delta(s, \Pi^{-1}(k-1))), f)$, where
the first $k$ values represent the transitions from state $s$ and the last value, $f$, is 1
if $s \in F$ or 0 otherwise. Let $\varphi(\Omega) = 0$ and $\varphi(\pi) = 1$. Thus, the representations
of $\Omega$ and $\pi$ are $(0^k, 0)$ and $(0^k, 1)$, respectively. We can continue this process
considering the states by increasing rank order, and in each rank we number the
states by lexicographic order over their transition representations. It is important
to note that transitions from a given state can only refer to states of a lower
rank, and thus already numbered. The sequence of tuples $(\Delta(i))_{i \in [0, n]}$ is the
canonical string representation of $A$. Formally, the assignment of state numbers,
$\varphi$, can be described by the following simple algorithm:

    φ(Ω) ← 0; φ(π) ← 1; i ← 2
    for l in ]0, rk(A)]
        for s ∈ R_l by lexicographic order over Δ(s)
            φ(s) ← i
            i ← i + 1

In Figure 2, we present a MADFA (n = 7 and k = 3), the function $\varphi$ and its
canonical representation.
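The numbering procedure can be sketched in Python as follows. The encoding (a dictionary from state names to successor tuples, with the dead state included) and the name `canonical_string` are assumptions of this sketch; the example automaton is the one encoded by the canonical representation of Figure 2, with arbitrarily chosen state names.

```python
def canonical_string(delta, finals, dead):
    """Number the states of a MADFA by rank, then by their tuples Delta(s).

    delta maps every state (the dead one included) to its tuple of successors.
    Returns the list of (k+1)-tuples, as lists, indexed by state number.
    """
    k = len(delta[dead])
    rank = {}

    def rk(s):
        if s not in rank:
            rank[s] = max((1 + rk(t) for t in delta[s] if t != dead), default=0)
        return rank[s]

    for s in delta:
        if s != dead:
            rk(s)

    phi = {dead: 0}
    rows = [[0] * (k + 1)]

    def row(s):
        # Successors have a lower rank, so they are already numbered.
        return [phi[t] for t in delta[s]] + [int(s in finals)]

    for l in range(max(rank.values()) + 1):
        layer = sorted((s for s in rank if rank[s] == l), key=row)
        rows.extend(row(s) for s in layer)
        for s in layer:
            phi[s] = len(phi)  # next free number
    return rows


# The MADFA of Figure 2 with hypothetical state names; "D" is the dead state.
delta = {
    "i": ("d", "e", "e"),
    "e": ("c", "D", "D"),
    "d": ("b", "b", "D"),
    "c": ("a", "b", "a"),
    "b": ("a", "p", "p"),
    "a": ("p", "p", "p"),
    "p": ("D", "D", "D"),
    "D": ("D", "D", "D"),
}
for r in canonical_string(delta, {"p"}, "D"):
    print(r)
```

In a MADFA no two states of the same rank share a tuple, so the sort is unambiguous; the pre-dead state receives number 1 without being treated as a special case.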

The characterization of these strings and that they constitute a canonical 
representation for MADFAs is given by the following theorem: 

Theorem 1 ([2, Thms. 3-5]). There exists a bijection between non-isomorphic
MADFAs with n states and k symbols and the set of strings $(s_i)_{i \in [0, (k+1)(n+1)[}$,
with $s_i \in [0, n[$, that satisfy the following conditions. Let $(f_i)_{i \in [1, n[}$ be the sequence
of the positions in $(s_i)_i$ of the first occurrence of each $i \in [1, n[$. Let $d < n$ and let
$(r_l)_{l \in [0, d]} \in [1, n]$ be the sequence of the first states of each rank in $(s_i)_i$. Then:

$s_0 = \cdots = s_k = \cdots = s_{2k} = 0 \;\wedge\; s_{2k+1} = 1$   (N0)

$(\forall i \in [0, n])\; s_{(k+1)i+k} \in \{0, 1\}$   (N1)

$r_0 = 1 \;\wedge\; r_1 = 2 \;\wedge\; r_d = n \;\wedge\; (\forall l \in [0, d[)\; r_l < r_{l+1}$   (N2)

$(\forall i \in [1, n[)\; s_{f_i} = i \;\wedge\; (\forall j \in [0, n])(\forall m \in [0, k[)\; ((k+1)j + m < f_i \Rightarrow s_{(k+1)j+m} \neq i)$   (N3)

$(\forall l \in [0, d[)(\forall i \in [r_l, r_{l+1}[)\; k r_{l+1} + i < f_i$   (N4)

$(\forall l \in [0, d])(\forall i \in [r_l, r_{l+1}[)(\exists m \in [0, k[)\; s_{(k+1)i+m} \in [r_{l-1}, r_l[$   (N5)

$(\forall l \in [0, d[)(\forall i \in [r_l, r_{l+1} - 1[)\; (s_{(k+1)i+m})_{m \in [0, k]} < (s_{(k+1)(i+1)+m})_{m \in [0, k]}$   (N6)

Fig. 2. An example of a MADFA that can be described by the canonical representation
[[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 0], [2, 1, 1, 0], [2, 3, 2, 0], [3, 3, 0, 0], [4, 0, 0, 0], [5, 6, 6, 0]].

The condition N0 gives the representation of the dead ($\Omega$) and the pre-dead
state ($\pi$). The condition N1 states that the last symbol of each state
representation indicates if the state is final or not. The condition N2 ensures that states
are numbered by increasing rank order. The condition N3 defines the sequence
$(f_i)_{i \in [1, n[}$, and ensures that $A$ is initially connected. The condition N4 is a direct
consequence of the rank definition, i.e., a state can only refer to a state of a lower
rank. The condition N5 states that every state has a transition to a state with
rank immediately lower than its own. The condition N6 ensures that within a
rank the state representations are lexicographically ordered.
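To make the role of each condition concrete, the following Python sketch checks a candidate representation, given as a list of (k+1)-tuples. It follows the prose reading of N0 to N6 given above and recomputes the ranks from the transitions; it is not a transcription of the index formulas of Theorem 1, and the name `is_madfa_string` is an assumption of this sketch.

```python
def is_madfa_string(rows):
    """Check rows[0..n] (k targets plus a finality bit each) against N0-N6."""
    n, k = len(rows) - 1, len(rows[0]) - 1
    # N0: row 0 is the dead state, row 1 the pre-dead (final) state.
    if n < 1 or rows[0] != [0] * (k + 1) or rows[1] != [0] * k + [1]:
        return False
    # N1: every row ends with a finality bit.
    if any(len(r) != k + 1 or r[k] not in (0, 1) for r in rows):
        return False
    rank = {0: -1, 1: 0}  # convention of this sketch: dead state below rank 0
    for i in range(2, n + 1):
        targets = rows[i][:k]
        # N4: a state may only refer to states that are already numbered.
        if any(not 0 <= t < i for t in targets):
            return False
        # N5 holds by construction: some target has rank exactly rank[i] - 1.
        rank[i] = 1 + max(rank[t] for t in targets)
        # N2: numbers follow increasing rank, and rank 0 contains only state 1.
        if rank[i] < max(rank[i - 1], 1):
            return False
        # N6: strict lexicographic order inside a rank.
        if rank[i] == rank[i - 1] and not rows[i - 1] < rows[i]:
            return False
    # N3: every state but the last is the target of some transition, and the
    # last state is alone in the top rank (r_d = n in N2).
    referenced = {t for r in rows for t in r[:k]}
    return referenced >= set(range(1, n)) and (n == 1 or rank[n] > rank[n - 1])


figure_2 = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 0], [2, 1, 1, 0],
            [2, 3, 2, 0], [3, 3, 0, 0], [4, 0, 0, 0], [5, 6, 6, 0]]
print(is_madfa_string(figure_2))  # True

swapped = figure_2[:4] + [figure_2[5], figure_2[4]] + figure_2[6:]
print(is_madfa_string(swapped))   # False: states 4 and 5 violate N6
```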



4 A normal form for ADFAs 



If an ADFA is not minimal, then it has at least two mergeable states, but
not all equivalent states need to be mergeable. The following two lemmas give
characterizations of equivalent states in an ADFA that will be used to obtain a
canonical representation.

Lemma 3. In an ADFA every two equivalent states must belong to the same 
rank. 

This follows directly from the definitions. 

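Lemma 4. Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be an ADFA. For all $s, s' \in S$, if
$s \sim s'$ then there exists $w \in \Sigma^*$ such that $\delta(s, w)$ and $\delta(s', w)$ are mergeable
states.

Proof. If $s \sim s'$ then $(\forall \sigma \in \Sigma)\; \delta(s, \sigma) \sim \delta(s', \sigma)$. If all these successors coincide,
$s$ and $s'$ are themselves mergeable and $w = \epsilon$. Otherwise, suppose that there exists
$\sigma_1 \in \Sigma$ such that $s_1 = \delta(s, \sigma_1) \neq \delta(s', \sigma_1) = s'_1$. Because $s_1 \sim s'_1$ we can proceed
as before, but because $A$ is acyclic and $|S|$ is finite this process must stop, and
two mergeable states, $s_j$ and $s'_j$ for $j < |S|$, must be reached. The concatenation
of the $\sigma_1 \ldots \sigma_j$ provides the word $w$.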
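The construction used in the proof can be followed step by step in code. The sketch below, in Python, computes right languages by exhaustive enumeration (feasible because an ADFA accepts a finite language) and then descends along a symbol on which the successors differ. The encoding, the marker `DEAD`, the function names and the example automaton are assumptions made for illustration.

```python
DEAD = "Omega"  # hypothetical marker for the dead state


def right_language(delta, finals, s):
    """Finite set of words x with delta(s, x) in F."""
    if s == DEAD:
        return frozenset()
    words = {""} if s in finals else set()
    for symbol, t in delta[s].items():
        words |= {symbol + x for x in right_language(delta, finals, t)}
    return frozenset(words)


def mergeable(delta, finals, s, t):
    return (s in finals) == (t in finals) and delta[s] == delta[t]


def merge_word(delta, finals, s, t):
    """Word w of Lemma 4 for two equivalent states s and t."""
    if right_language(delta, finals, s) != right_language(delta, finals, t):
        raise ValueError("states are not equivalent")
    w = ""
    while not mergeable(delta, finals, s, t):
        # Equivalent states agree on finality, so some transition must differ.
        symbol = next(a for a in sorted(delta[s]) if delta[s][a] != delta[t][a])
        s, t, w = delta[s][symbol], delta[t][symbol], w + symbol
    return w


# Hypothetical ADFA over {a, b}: s and t are equivalent but not mergeable.
delta = {
    "i": {"a": "s", "b": "t"},
    "s": {"a": "p", "b": DEAD},
    "t": {"a": "q", "b": DEAD},
    "p": {"a": DEAD, "b": DEAD},
    "q": {"a": DEAD, "b": DEAD},
}
finals = {"p", "q"}
print(repr(merge_word(delta, finals, "s", "t")))  # 'a'
print(repr(merge_word(delta, finals, "p", "q")))  # ''
```

The loop never reaches the dead state: successors of equivalent states are equivalent, and a useful state cannot be equivalent to the dead state, whose right language is empty.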

In order to have a canonical representation for ADFAs we must provide an 
ordering for the equivalent states. Because they must appear in the same rank 
we may restrict the state ordering by rank and consider a proper extension of 
the function $\varphi$ (assignment of state numbers), and so a proper extension of the
canonical representation for MADFAs. In particular we take $\varphi(\Omega) = 0$. Because
ADFAs are deterministic, we have

Lemma 5. Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be an ADFA. Then

$(\forall s \neq s' \in S \cup \{\Omega\})\; \mathcal{L}_L(s) \cap \mathcal{L}_L(s') = \emptyset$.

Any two different states can be distinguished if we define any injective function
$\Psi : S \to O$, where $O$ must be a totally ordered set.

For instance, given an order over $\Sigma$ we could have $\Psi : S \to \Sigma^*$ given by
$\Psi(s) = \min\{w \mid w \in \mathcal{L}_L(s)\}$, for $s \in S$, where min is taken considering the
lexicographical order on $\Sigma^*$. Then, whenever two mergeable states $s$ and $s'$ were
found, we could take $s < s'$ if and only if $\Psi(s) < \Psi(s')$ (lexicographically).

In the general case, given an injective function $\Psi$, let $\prec_\Psi$ be an ordering such
that $(\forall s, s' \in S)\; s \prec_\Psi s'$ if either:

1. $\Delta(s) < \Delta(s')$, where $<$ is the lexicographical order; or

2. $\Delta(s) = \Delta(s')$ and $\Psi(s) < \Psi(s')$.

The algorithm of Section 3.1 that computes the function $\varphi$ can be adapted for
ADFAs by not considering the state $\pi$, initializing $i$ with 1 and considering the
order $\prec_\Psi$. Consider the ADFA of Figure 1. Its state ranks are the following:
$R_0 = \{s_7, s_8\}$, $R_1 = \{s_6\}$, $R_2 = \{s_4\}$, $R_3 = \{s_1, s_5\}$, $R_4 = \{s_2, s_3\}$ and $R_5 =
\{s_0\}$. Regarding the function $\Psi$ above, the function $\varphi$ is defined by the following
tuples: $(s_7, 1)$, $(s_8, 2)$, $(s_6, 3)$, $(s_4, 4)$, $(s_5, 5)$, $(s_1, 6)$, $(s_3, 7)$, $(s_2, 8)$, $(s_0, 9)$. Its
string representation is

[[0, 0, 0, 0] [0, 0, 0, 1] [0, 0, 0, 1] [1, 1, 2, 0] [3, 1, 1, 0] [0, 1, 0, 0] [4, 4, 0, 0] [5, 0, 0, 0] [5, 0, 0, 0] [6, 7, 8, 0]];

which is lexicographically ordered within a rank (i.e., respects condition N6,
considering $\le$ instead of $<$).
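A minimal Python sketch of this adapted numbering is given below, taking for the tie-breaking function the smallest word of the left language of each state. The encoding, the name `adfa_string` and the example automaton are assumptions of this sketch; symbols are single characters whose character order is taken as the order of the alphabet, so that Python string comparison coincides with the lexicographical order of Section 2.

```python
def adfa_string(delta, finals, initial, dead, alphabet):
    """String representation of a trim ADFA, each rank ordered by (Delta, Psi)."""
    # Psi(s): smallest word leading from the initial state to s. All paths are
    # enumerated, which is finite (if exponential) because the ADFA is acyclic.
    psi = {}

    def walk(s, w):
        if s == dead:
            return
        if s not in psi or w < psi[s]:
            psi[s] = w
        for a in alphabet:
            walk(delta[s][a], w + a)

    walk(initial, "")

    rank = {}

    def rk(s):
        if s not in rank:
            succ = [delta[s][a] for a in alphabet if delta[s][a] != dead]
            rank[s] = max((1 + rk(t) for t in succ), default=0)
        return rank[s]

    for s in delta:
        rk(s)

    phi = {dead: 0}
    rows = [[0] * (len(alphabet) + 1)]

    def row(s):
        return [phi[delta[s][a]] for a in alphabet] + [int(s in finals)]

    for l in range(max(rank.values()) + 1):
        layer = sorted((s for s in rank if rank[s] == l),
                       key=lambda s: (row(s), psi[s]))
        rows.extend(row(s) for s in layer)
        for s in layer:
            phi[s] = len(phi)  # numbering starts at 1, no special pre-dead state
    return rows


# Hypothetical ADFA over {a, b}; p and q are equivalent pre-dead states.
delta = {
    "i": {"a": "s", "b": "t"},
    "t": {"a": "q", "b": "D"},
    "s": {"a": "p", "b": "D"},
    "q": {"a": "D", "b": "D"},
    "p": {"a": "D", "b": "D"},
}
print(adfa_string(delta, {"p", "q"}, "i", "D", "ab"))
# [[0, 0, 0], [0, 0, 1], [0, 0, 1], [1, 0, 0], [2, 0, 0], [3, 4, 0]]
```

Here p precedes q because its smallest incoming word, aa, is smaller than ba; without the tie-break the two states would be indistinguishable by their tuples.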

As we aim to obtain an exact generator that will proceed by increasing rank
order, it is convenient that $\Psi(s)$ is related to a maximal word of $\mathcal{L}_L(s)$. To ensure
that in a rank the state representations are lexicographically ordered we also take
into consideration the ranks and the finalities of the states.

Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be an ADFA, with $\Sigma$ ordered. For each state
$s \in S$, let $\delta^{-1}(s) = \{(s', \sigma) \mid \delta(s', \sigma) = s\}$, and for $(s', \sigma) \in \delta^{-1}(s)$ consider the
tuple $\tau = (\mathrm{rk}(s'), \sigma, f_{s'})$ with $f_{s'} = 1$ if $s' \in F$, or 0 otherwise. We define $\mathcal{L}_L^{\mathrm{rk}}(s)$
to be the set of sequences of these tuples $\tau_0 \ldots \tau_l$ such that $\sigma_l \cdots \sigma_0 \in \mathcal{L}_L(s)$.
The characteristic word of $s$ is

$\Psi_c(s) = \min\{\tau_0 \ldots \tau_l \mid \tau_0 \cdots \tau_l \in \mathcal{L}_L^{\mathrm{rk}}(s)\}$,

where min is taken lexicographically.




Fig. 3. An ADFA whose canonical string representation, considering $\Psi_c$, is:
[[0, 0, 0, 0] [0, 0, 0, 1] [0, 0, 0, 1] [1, 0, 1, 0] [1, 2, 0, 0] [3, 4, 2, 0]].

In the example of Figure 3, we have $\Psi_c(1) = 1a02b0$ and $\Psi_c(2) = 1a02a0$,
which shows that the numbers assigned to these states must be reversed, i.e.,
$\varphi(1) = 2$ and $\varphi(2) = 1$.

The following three theorems guarantee that this representation is indeed a 
canonical representation for ADFAs. 

Theorem 2. Let $A = (S \cup \{\Omega\}, \Sigma, \delta, s_0, F)$ be an ADFA with $\mathrm{rk}(A) = d$,
$n = |S|$ and $k = |\Sigma|$. Let $(s_i)_{i \in [0, (k+1)(n+1)[}$, with $s_i \in [0, n[$, be the string
representation of $A$ obtained using the function $\Psi_c$. Then the conditions N0-N5
of Theorem 1 are satisfied, together with the following condition N6':

$(\forall l \in [0, d[)(\forall i \in R_l)\; i \prec_{\Psi_c} i + 1$.   (N6')

Proof. Follows from the above considerations.

Theorem 3. Let $(s_i)_{i \in [0, (k+1)(n+1)[}$, with $s_i \in [0, n[$, be a string that satisfies
conditions N0-N5 and condition N6'. Then the corresponding automaton is an
ADFA with n states and an alphabet of k symbols.

Proof. From conditions N0-N5, we know that we obtain a trim complete
acyclic deterministic finite automaton. The relaxation of condition N6 to
condition N6' allows some states to be mergeable.

Theorem 4. Let $(s_i)_{i \in [0, (k+1)(n+1)[}$ and $(s'_i)_{i \in [0, (k+1)(n+1)[}$ be two distinct strings
satisfying conditions N0-N5 and condition N6'. Then they correspond to distinct
ADFAs.

Proof. The proof follows exactly the lines of Theorem 5 in Almeida et al. [2],
because of condition N6'.

5 Exact generation of ADFAs 

To generate all the string representations of the ADFAs with n states and k 
symbols, we will use the same approach described by Almeida et al. [2], traversing
the search tree, backtracking on its way, to generate all possible representations.
The representations will appear in lexicographical order. The conditions for generation
are the same but with N6 replaced by its relaxed form N6'. The satisfaction of the
conditions on the order of equivalent states is too complex to be included in the 
generation. When a pair of equivalent states is generated, instead of renumbering 
them according to the first word (for some order) that reaches each state, we 
proceed with the generation of all the states in lexicographical order of their 
A values, and discard the automata for which the previously stated order is 
contradicted. 

The problem with this strategy is that, with the "natural" lexicographical 
order, the contradiction to the order of two states in rank may appear only 
when generating the last state, i.e., the initial state of the automaton. This is 
very inconvenient, because a lot of generating work is going to be discarded and 
because of the backtracking strategy, the corresponding search tree is not pruned 
as it should. On the other hand, using the order described in Section 4 we can 
evaluate the possible contradictions after the complete generation of each rank 
of states. 

The algorithm goes as follows: 

— At the beginning of the generation of each rank, there are two data struc- 
tures: 

ProbL: a set of lists of states that are equivalent and for which we want to
ensure that the characteristic words that reach them are in accordance
with that order;

Refs: an empty set of lists of states that, in that rank, have transitions to
states in some list in ProbL.

— Every time two or more states with the same A are generated, they are 
added as a new list to ProbL. 

$(\Delta(s_1) = \Delta(s_2) = \cdots = \Delta(s_l)) \wedge (\varphi(s_1) < \varphi(s_2) < \cdots < \varphi(s_l)) \Rightarrow$

$\Rightarrow \text{ProbL} \leftarrow \text{ProbL} \cup \{[s_1, s_2, \ldots, s_l]\}$.



— Every time a newly generated state has a transition to a state present in a 
list of ProbL, it is added to Refs with information about the state it has a
transition to.

— When the state generation of a given rank is finished (because no more states 
in that rank can be generated according to rules N0-N6'), each list R in Refs
of states with transitions to states in a list L in ProbL is examined.

• For $x \in L$, let $m(x) = \min\{(\sigma, f_s) \mid (s, \sigma) \in \delta^{-1}(x) \wedge s \in R\}$, where $f_s$
represents the finality of $s$. If for some pair of states of L a contradiction
is found, i.e.,

$(\exists x_1, x_2 \in L)(\exists s_1, s_2 \in R)(\exists \sigma_1, \sigma_2 \in \Sigma)$
$(\delta(s_1, \sigma_1) = x_1 \wedge \delta(s_2, \sigma_2) = x_2 \wedge m(x_1) < m(x_2) \wedge \varphi(x_1) > \varphi(x_2))$;

then the generation of this automaton is aborted and the process is 
continued by backtracking. 

• For all the non-singleton sublists $M_{(\sigma, f)}$ of states in L such that

$(\forall x \in M_{(\sigma, f)})\; m(x) = (\sigma, f)$;

its elements are removed from L, and the list of the states $s$ of R such
that $(\delta(s, \sigma) \in M_{(\sigma, f)} \wedge f_s = f)$, with the order induced by L, is added
to ProbL.

• Finally, if

$(\forall x_1, x_2 \in L)(\exists s_1, s_2 \in R)(\exists \sigma_1, \sigma_2 \in \Sigma)$
$((\delta(s_1, \sigma_1) = x_1 \wedge \delta(s_2, \sigma_2) = x_2) \Rightarrow (m(x_1) < m(x_2) \Rightarrow \varphi(x_1) < \varphi(x_2)))$;

then all the states in L that are the image of a transition from a state
in R are removed from L, and R is removed from Refs.

• All empty or singleton lists are removed from ProbL. 

• Before the generation of a new rank is started, Refs is emptied.

The correctness of this algorithm follows from the considerations in Section 4. 
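
The contradiction test performed at the end of each rank can be sketched in Python as follows. This covers only the computation of m(x) and the abort condition, not the bookkeeping of ProbL and Refs; the encoding and the names `m` and `contradiction` are assumptions of this sketch, and the two tiny examples are hypothetical.

```python
def m(x, R, delta, finals, alphabet):
    """Smallest pair (symbol index, finality of s) over transitions s -> x, s in R.

    Returns None when no state of R has a transition to x.
    """
    pairs = [
        (j, int(s in finals))
        for s in R
        for j, a in enumerate(alphabet)
        if delta[s][a] == x
    ]
    return min(pairs, default=None)


def contradiction(L, R, delta, finals, alphabet, phi):
    """True if two states of L are numbered against the order of their m values."""
    reached = [(m(x, R, delta, finals, alphabet), phi[x]) for x in L]
    reached = [(mx, px) for mx, px in reached if mx is not None]
    return any(
        m1 < m2 and p1 > p2
        for m1, p1 in reached
        for m2, p2 in reached
    )


# Hypothetical rank: p and q are equivalent, numbered 1 and 2, and the newly
# generated state s reaches q with a smaller symbol than p.
phi = {"p": 1, "q": 2}
bad = {"s": {"a": "q", "b": "p"}}
good = {"s": {"a": "p", "b": "q"}}
print(contradiction(["p", "q"], ["s"], bad, set(), "ab", phi))   # True
print(contradiction(["p", "q"], ["s"], good, set(), "ab", phi))  # False
```

In the first case the generator would abandon the partial automaton and backtrack, since the states of the list are numbered in the opposite order of the transitions that reach them.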

5.1 Some experimental results

In Table 1 the numbers of MADFAs and ADFAs for some small values of n
and k are summarized. We observe that almost all ADFAs are MADFAs. Several
performance times are also presented. As an alternative to the exact generators,
ADFAs and MADFAs can also be enumerated by generating initially-connected
deterministic automata (ICDFAs), using the method presented in Almeida et
al. [1], and then testing for acyclicity, trimness and, possibly, minimality. But the
number of ICDFAs grows much faster than the number of ADFAs (or MADFAs),
so the generate-test-reject method is not feasible. In column Time B of Table 1
we present the running times obtained by this method (for small values of n and
k). In column Time A of Table 1 we present the running times obtained by the
exact generation methods.

k = 4
n   MADFA      Time A (s)   Time B (s)   A/B     ADFA       Time A (s)   Time B (s)
2   30                                   1       30
3   3900       0.2          1.6          0.13    3950       0.51         1.55
4   1460700    7.7          5549         0.001   1488120    236          5326

k = 5
n   MADFA      Time A (s)   Time B (s)   A/B     ADFA       Time A (s)   Time B (s)
2   62                                   1       62
3   26164      0.121        2.396        0.05    26344      3.4          116
4   43023908   213                               43411218   6805         4872111

Table 1. Number of MADFAs and ADFAs for small values of n and k. Performance
times for their generation: exact (A) and with a test-rejection pass (B).



Considering only the performance times for MADFAs and k = 2, we obtained
a curve fitting for both methods: for the exact generation a function $f(n) =
e^{3.66n - 20.76}$ and for the test-reject a function $g(n) = e^{4.21n - 23.00}$, which gives
$g(n)/f(n) = e^{0.55n - 2.24}$.

As for the performance values, we should only consider their order of magnitude,
as they were obtained using different CPUs and programs implemented
in different programming languages. Both performance times B were obtained
using a C implementation running on an AMD Athlon 64 at 2.5GHz. Performance
times A were obtained using a C++ implementation running on
an Intel® Xeon® 5140 at 2.33GHz, and a Python implementation running on
an AMD Athlon 64 at 2.5GHz, respectively for MADFAs and for ADFAs (in
general the C++ implementation for MADFAs is two times faster than the
corresponding Python implementation).

It is reasonable that for (very) small values of n the test-reject method is
faster, as the pruning of non-legal ADFAs is a relatively costly operation. But
because the number of ICDFAs grows much faster than the number of ADFAs,
that will not happen for larger values of n.

6 Counting ADFAs for n and k 

Let $A_k(n)$ be the number of ADFAs with n states over an alphabet of k symbols
and let $M_k(n)$ be the corresponding number of MADFAs. In Almeida et al. [2],
the values of $M_k(n)$ were determined for $n \in [1, 5]$. The same kind of results
can be obtained for $A_k(n)$. The values of $A_k(n)$ for small values of n can be
determined by considering the possible distribution of states by ranks and the
number of dangling states that are targets of transitions from a state of a previous
rank, for the first time. Using the Principle of Inclusion and Exclusion we have:

$A_k(2) = M_k(2) = 2(2^k - 1)$.

$A_k(3) = M_k(3) + (3^k - 2^{k+1} + 1) = 2^2 (3^k - 2^k)(2^k - 1) + (3^k - 2^{k+1} + 1)$.

$A_k(4) = 2^3 (4^k - 3^k)(3^k - 2^k)(2^k - 1) + 2^2 (4^k - 2 \cdot 3^k + 2^k)(2^k - 1)^2$
$\qquad + 2 (4^k - 3^k)(3^k - 2 \cdot 2^k + 1) + (4^k - 3 \cdot 3^k + 3 \cdot 2^k - 1)/3$.

For n = 5 there are already 12 configurations to be considered. For values of
$n \in [2, 5]$, $\lim_{k \to \infty} M_k(n)/A_k(n) = 1$. We note that this behaviour is also observed
(experimentally) in the case of arbitrary ICDFAs.
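
For very small n the closed forms for n = 2 and n = 3 can be cross-checked by exhaustive search. The Python sketch below is a plain generate-and-test count, not the exact generator of Section 5: it enumerates automata whose states are numbered so that transitions only go to higher-numbered states (every trim ADFA admits such a numbering with the initial state first), keeps the trim ones, and identifies isomorphic automata by renumbering states in breadth-first order. All names are assumptions of this sketch.

```python
from itertools import product

DEAD = -1  # hypothetical marker for the dead state


def bfs_form(delta, finals):
    """Breadth-first renumbering from state 0; None if a state is unreachable."""
    order, queue, head = {0: 0}, [0], 0
    while head < len(queue):
        s = queue[head]
        head += 1
        for t in delta[s]:
            if t != DEAD and t not in order:
                order[t] = len(order)
                queue.append(t)
    if len(order) < len(delta):
        return None
    return tuple(
        tuple(t if t == DEAD else order[t] for t in delta[s]) + (s in finals,)
        for s in queue
    )


def count_adfa(n, k, minimal=False):
    """Number of non-isomorphic trim ADFAs (or MADFAs) with n states, k symbols."""
    forms = set()
    # State i may only point to DEAD or to a state j > i, which forces acyclicity.
    choices = [list(product([DEAD] + list(range(i + 1, n)), repeat=k))
               for i in range(n)]
    for delta in product(*choices):
        for bits in product([False, True], repeat=n):
            finals = {s for s in range(n) if bits[s]}
            useful = set()
            for s in reversed(range(n)):  # successors are decided first
                if s in finals or any(t in useful for t in delta[s]):
                    useful.add(s)
            if len(useful) < n:
                continue
            # Lemma 2: minimal if and only if no two states are mergeable.
            if minimal and len({(delta[s], s in finals) for s in range(n)}) < n:
                continue
            form = bfs_form(delta, finals)
            if form is not None:
                forms.add(form)
    return len(forms)


def M3(k):
    return 2**2 * (3**k - 2**k) * (2**k - 1)


def A3(k):
    return M3(k) + (3**k - 2**(k + 1) + 1)


print(count_adfa(2, 5), 2 * (2**5 - 1))                 # 62 62
print(count_adfa(3, 5), A3(5))                          # 26344 26344
print(count_adfa(3, 5, minimal=True), M3(5))            # 26164 26164
```

The values obtained for k = 5 agree with the corresponding entries of Table 1. The search space grows too quickly for this approach to be practical beyond very small n.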

7 Conclusions 

A canonical representation for minimal acyclic deterministic finite automata was
extended to allow equivalent states, and thus uniquely represent trim acyclic
deterministic finite automata. A method for the exact generation of MADFAs
was extended to allow the generation of equivalent states, while still avoiding the
multiple generation of isomorphic automata. More experimental tests must
be carried out in order to determine the actual overhead of pruning non-legal
equivalent states.